Counting bats

Itai Benjamini Gady Kozma 



Assume $G$ is an infinite graph ("the cave") which is recurrent for the simple random walk
(SRW). Several independent walkers ("the bats") are performing SRW on $G$ simultaneously,
with the same clock and the same starting vertex $o$. $G$ is not known to you (hence the cave
metaphor: it is too dark to see $G$). The only information given to you is the set of return
times to $o$, though you do not know how many walkers returned at any given time, only whether
this number is $0$ or positive. Can you almost surely tell how many walkers there are, by only
observing the times at which $o$ is occupied?

Theorem. Almost surely it is possible to tell how many walkers there are, by observing the
times at which $o$ is occupied.

Formally, there is a function $\mathcal{A} : \{0,1\}^{\mathbb{N}} \to \mathbb{N}$ which, given
the visits of the walkers, outputs their number, and is correct with probability 1. Again,
$\mathcal{A}$ does not depend on the graph. For more bat-related results, see our paper [1].
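
To make the input of the counting function concrete, here is a minimal sketch that generates
the observed 0/1 sequence for $k$ walkers. It assumes the graph is the integer line with
$o = 0$ (a recurrent graph chosen purely for illustration; the theorem itself does not depend
on the graph). The name `observe_occupancy` and its parameters are hypothetical.

```python
import random


def observe_occupancy(k, steps, seed=0):
    """Simulate k independent simple random walks on Z, all started at 0.

    Returns a list occ of length `steps`, where occ[t - 1] is 1 if at least
    one walker is at the origin at time t, and 0 otherwise. The number of
    walkers present at the origin is deliberately not recorded.
    Assumption: the graph is Z, used here only as an example.
    """
    rng = random.Random(seed)
    positions = [0] * k
    occupancy = []
    for _ in range(steps):
        # All walkers move on the same clock.
        positions = [x + rng.choice((-1, 1)) for x in positions]
        occupancy.append(1 if 0 in positions else 0)
    return occupancy
```

On this particular graph a walker can be at the origin only at even times, so every entry at
an odd time is 0 regardless of $k$.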

Corollary. There is no pair of recurrent infinite graphs such that the return times of two
independent SRWs on one of the graphs are absolutely continuous with respect to the return
times of one SRW on the other graph.

Problem. The algorithm in the proof seems far from efficient. Give lower and upper bounds
and suggest improved or optimal algorithms.

("The algorithm" here is simply the function $\mathcal{A}$, which is not really an algorithm
in the computer sense: it does not "run" or "stop". Nevertheless one can find reasonable
algorithmic versions of the problem and investigate them.)

Problem. We do not know if reversibility is important (it is definitely used in our proof).
So we ask: is there an algorithm that gives the right number of walkers for any recurrent
Markov chain, without knowledge of the Markov chain?

Proof 

The first step is to reconstruct the distribution of returns of a single walker, no matter how 
many walkers one actually examines. 

Lemma 1. There is an algorithm to reconstruct
\[ p(n) = \mathbb{P}(\text{a single walker returns to } o \text{ at time } n) \]
with no knowledge of the graph structure.
Proof. We fix some large $T_1$, wait until we see a time interval of length $> T_1$ with no
visits to $o$, and look at the first return after this long return-free interval, denoted by
$s_1$. Let $E_1$ be the event that there is a return at $s_1 + n$. Continue similarly: choose
some large $T_2$, let $s_2$ be the first time after $s_1 + n$ at which a return-free interval
longer than $T_2$ finished, and let $E_2$ be the event that there is a return at $s_2 + n$.
Etc. For concreteness, fix $T_i = 2^i$.

Write now $E_i = G_i \cup B_i$ where $G_i$ (the "good" event) is the event that the walker
which returned at time $s_i$ also returned at $s_i + n$, and $B_i$ (the "bad" event) is the
event that another walker returned at time $s_i + n$. We would have liked to sample the $G_i$,
which are exactly i.i.d. variables with probability $p(n)$, but we can only sample the $E_i$.
So we need to show that the $B_i$ are rare.

The key observation follows from a quantitative non-concentration of return times established
in [3]: on any graph, the probability that a walker returned for the first time at time $t$,
conditioned on not having returned before $t$, is $\le (C\log t)/t$. Denote therefore by
$B_i(j)$ the event that it is the $j$-th walker that returned at time $s_i + n$ ($B_i$ is a
"bad" event, so we assume that the $j$-th walker did not return at time $s_i$, or that there
were two returns at $s_i$). Let $r$ be the last visit of the $j$-th walker to $o$ before
$s_i$. By definition, this means that $r < s_i - 2^i$. We can now write

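\[
\begin{aligned}
\mathbb{P}(B_i(j) \mid s_i, r)
 &= \mathbb{P}(\text{a walker returned at time } s_i - r + n
    \mid \text{not returning in the first } s_i - r \text{ steps}) \\
 &\le \sum_{l=0}^{n} \mathbb{P}(\text{a walker returned at time } s_i - r + l
    \mid \text{not returning in the first } s_i - r + l - 1 \text{ steps}) \\
 &\le \sum_{l=0}^{n} \frac{C\log(s_i - r + l)}{s_i - r + l} \le \frac{Cni}{2^i}.
\end{aligned}
\]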

Integrating over $r$ and $s_i$ gives that $\mathbb{P}(B_i(j)) \le Cni2^{-i}$, and summing over
$j$ (which has $k$ possibilities, where $k$ is the (unknown) number of walkers) gives
$\mathbb{P}(B_i) \le Ckni2^{-i}$. We see that these numbers are summable, so only a finite
number of the $B_i$ occur. This means that $p(n)$ may be calculated by
\[ p(n) = \lim_{N\to\infty} \frac{1}{N}\,\bigl|\{i \le N : E_i \text{ occurred}\}\bigr| \]
which is the algorithm sought for. □
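
The sampling scheme in this proof can be sketched on a finite observation as follows. This is
only an illustration: the function name `estimate_return_prob` is hypothetical, the thresholds
are the $T_i = 2^i$ fixed above, and as a simplification the search for the next long
return-free interval restarts from scratch right after time $s_i + n$.

```python
def estimate_return_prob(occupancy, n):
    """Count the events E_i on a finite 0/1 observation (n >= 1).

    occupancy is the observed sequence in time order (1 = origin occupied).
    Returns (hits, samples): samples is the number of times s_i found,
    hits is the number of those with a return at time s_i + n.
    """
    hits = 0
    samples = 0
    i = 1      # index of the next sample; its threshold is T_i = 2**i
    gap = 0    # length of the current run of unoccupied times
    t = 0
    while t < len(occupancy):
        if occupancy[t]:
            if gap > 2 ** i:
                # t plays the role of s_i: first return after a long gap.
                if t + n >= len(occupancy):
                    break  # cannot observe time s_i + n
                samples += 1
                hits += occupancy[t + n]
                i += 1
                t += n + 1  # resume the search after time s_i + n
                gap = 0
                continue
            gap = 0
        else:
            gap += 1
        t += 1
    return hits, samples
```

The ratio `hits / samples` approximates $p(n)$ once `samples` is large. Because the thresholds
double, the observation has to be exponentially long in the number of samples, which
illustrates why the scheme is far from efficient.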

With the distribution of returns estimated, we now have a relatively easy task: we have a
known variable, the number of returns of a single walker. We are given a sample of a union of
$k$ independent copies of it and we need to estimate $k$. Taking the number of actual returns
up to time $t$ and dividing by the (known) expectation for a single walker would give a
variable with expectation $k$. It would be natural to assume that if we repeat this experiment
with times $t_i$ growing sufficiently fast, the resulting variables would be approximately
independent and hence it would be possible to calculate $k$ by the limit of the running
average. The only difficulty is to explain what "sufficiently fast" means, and it turns out
that this must depend on the graph. The following lemma essentially claims that this scheme
works if one takes $t_i$ to be the median of the $(2^i)$-th return of a single walker to $o$.
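
As a small illustration of the "divide by the single-walker expectation" step, the sketch
below computes the ratio for one fixed time $t$. It assumes the graph is the integer line,
where a single walker is at the origin at time $2m$ with probability $\binom{2m}{m}4^{-m}$;
this choice of graph and the function names are assumptions made for the example only. Double
returns are ignored here, exactly as in the heuristic above.

```python
def expected_visits_on_Z(t):
    """Expected number of visits to 0 at times 1..t for one SRW on Z."""
    total = 0.0
    p = 1.0  # P(walker at 0 at time 2m), starting from m = 0
    for m in range(1, t // 2 + 1):
        # binom(2m, m) / 4**m = binom(2m-2, m-1) / 4**(m-1) * (2m-1) / (2m)
        p *= (2 * m - 1) / (2 * m)
        total += p
    return total


def naive_count_estimate(occupancy, t):
    """Occupied times among 1..t divided by the single-walker expectation.

    occupancy[s - 1] is 1 if the origin is occupied at time s.
    Requires t >= 2 so that the expectation is positive.
    """
    observed = sum(occupancy[:t])
    return observed / expected_visits_on_Z(t)
```

A single such ratio is a noisy quantity; the lemma that follows is what justifies averaging
it over a fast-growing sequence of times.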

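Lemma 2. Let $X_n$ be i.i.d. $\mathbb{N}$-valued random variables, and let $S_n$ be the
corresponding random walk
\[ S_n = X_1 + \dots + X_n. \]
Let $M_n$ be the median of $S_n$, i.e.
\[ M_n = \operatorname{Med}(S_n) \iff n = \max\{i : \mathbb{P}(S_i < M_n) \ge \tfrac12\}. \]
Define
\[ Y_n = \frac{1}{n}\max\{i : S_i < M_n\}. \]
Then
\[ \frac{1}{N}\sum_{n=1}^{N} \frac{Y_{2^n}}{\mathbb{E}Y_{2^n}} \to 1 \]
as $N \to \infty$.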



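Proof. By definition, $\operatorname{Med} Y_n = 1$. It is easy to conclude from this that the
$Y_n$ form a precompact (tight) family of variables. Indeed,
\[ Y_n > \lambda \iff \max\{i : S_i < M_n\} > n\lambda \iff S_{\lfloor n\lambda\rfloor + 1} < M_n. \]
Since $S$ is an increasing random walk, to be smaller than $M_n$ at time $n\lambda$ its
increments must be smaller than $M_n$ on any block of $n$ consecutive variables between $0$
and $n\lambda$, and disjoint blocks are independent. So we get
\[
\mathbb{P}(Y_n > \lambda) = \mathbb{P}(S_{\lfloor n\lambda\rfloor+1} < M_n)
 \le \mathbb{P}\bigl(X_{(k-1)n+1} + \dots + X_{kn} < M_n \ \forall k \le \lfloor\lambda\rfloor\bigr)
 \le 2^{-\lfloor\lambda\rfloor}. \tag{1}
\]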
We will use the second moment method so we need to estimate
\[ \mathbb{E}(Y_nY_m) - \mathbb{E}(Y_n)\mathbb{E}(Y_m) \]
say, for $m < n$ (both will be powers of two but let us not record this fact in the notation).
Let therefore $\lambda = \lfloor (n/m)^{1/3} \rfloor$. Define the event $\mathcal{B}$ (the
"bad" event) to be the event that one of the following happened:

1. $Y_m > \lambda$.

2. $S_{mY_m+1} > M_{m\lambda^2}$ (note the $+1$ in the index: we are taking here the first
time $S_i$ rises above $M_m$).



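The probability of the first clause is estimated by (1) to be $\le 2^{-\lambda}$, so let us
estimate the probability of the second minus the first, i.e. the probability that
$Y_m \le \lambda$ but $S_{mY_m+1} > M_{m\lambda^2}$. Because $S$ is increasing, if
$mY_m < m\lambda$ then $mY_m + 1 \le m\lambda$ and then $S_{mY_m+1} \le S_{m\lambda}$, so we
can write
\[
\mathbb{P}(Y_m \le \lambda,\ S_{mY_m+1} > M_{m\lambda^2})
 \le \mathbb{P}(S_{m\lambda} > M_{m\lambda^2}) =: p.
\]
But then we can write
\[
\tfrac12 \le \mathbb{P}(S_{m\lambda^2} < M_{m\lambda^2})
 \le \mathbb{P}\bigl(X_{(k-1)m\lambda+1} + \dots + X_{km\lambda} < M_{m\lambda^2}\ \forall k \le \lambda\bigr)
 \le (1-p)^{\lambda}
\]
so $p \le C/\lambda$. Totally we get
\[ \mathbb{P}(\mathcal{B}) \le C/\lambda. \]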

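With the estimate (1) this gives
\[
\mathbb{E}(Y_n\mathbf{1}_{\mathcal{B}})
 \le \sum_{k=1}^{\infty} \mathbb{P}(\mathcal{B} \cap \{k-1 < Y_n \le k\}) \cdot k
 \le \sum_{k=1}^{\infty} Ck\min\Bigl\{\frac{1}{\lambda}, 2^{-k}\Bigr\}
 \le \frac{C\log^2\lambda}{\lambda}. \tag{2}
\]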



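We need a similar estimate for $Y_nY_m\mathbf{1}_{\mathcal{B}}$ and for this we need to
estimate $\mathbb{E}(Y_n \mid \mathcal{B} \cap \{k-1 < Y_m \le k\})$. We write
$nY_n = mY_m + 1 + Z$ and note that $Z$ is the number of steps our random walk needed to get
from $S_{mY_m+1}$ to $M_n$, so it is stochastically dominated by $nY_n$, even after
conditioning over $\mathcal{B} \cap \{k-1 < Y_m \le k\}$ (which is an event that looks at the
random walk only up to $mY_m + 1$). So
\[ \mathbb{E}(Y_n \mid \mathcal{B} \cap \{Y_m = y\}) \le y\,\frac{m}{n} + C \]
which we use to show
\[
\mathbb{E}(Y_nY_m\mathbf{1}_{\mathcal{B}})
 \le \sum_{k=1}^{\infty} \mathbb{P}(\mathcal{B} \cap \{k-1 < Y_m \le k\}) \cdot k \cdot \bigl(k(m/n) + C\bigr)
 \le \sum_{k=1}^{\infty} Ck^2\min\Bigl\{\frac{1}{\lambda}, 2^{-k}\Bigr\}
 \le \frac{C\log^3\lambda}{\lambda}. \tag{3}
\]
This finishes our treatment of the event $\mathcal{B}$.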

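We now restrict our attention to $\neg\mathcal{B}$. Let therefore $u$ be some atom of the
$\sigma$-field spanned by $Y_m, X_1, \dots, X_{mY_m+1}$ such that $u \subset \neg\mathcal{B}$,
and write
\[
\mathbb{E}(Y_n \mid u) = Y_m \cdot \frac{m}{n}
 + \frac{1}{n}\,\mathbb{E}\bigl(\max\{i : X_{mY_m+2} + \dots + X_{mY_m+1+i} < M_n - S_{mY_m+1}\} \bigm| u\bigr). \tag{4}
\]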

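The first term we bound by $C/\lambda^2$ (because of the first clause in the definition of
$\mathcal{B}$). For the second, we note that $X_{mY_m+2}, \dots$ has the same distribution as
$X_1, \dots$ (again, conditioning over $u$ does not change this fact) so this term is bounded
above by $\mathbb{E}Y_n$ and bounded below by
\[ \frac{1}{n}\,\mathbb{E}\bigl(\max\{i : S_i < M_n - M_{m\lambda^2}\}\bigr) \]
by the second clause in the definition of $\mathcal{B}$. Hence we need to estimate the variable
\[
\max\{i : S_i < M_n\} - \max\{i : S_i < M_n - M_{m\lambda^2}\}
 = |\{i : M_n - M_{m\lambda^2} \le S_i < M_n\}|.
\]
But this variable is stochastically dominated simply by $m\lambda^2Y_{m\lambda^2}$ because it
is the number of steps our random walk needs to traverse an interval of length at most
$M_{m\lambda^2}$. Combining both parts of (4) gives
\[ \mathbb{E}(Y_n) - C/\lambda \le \mathbb{E}(Y_n \mid u) \le C/\lambda + \mathbb{E}(Y_n) \]
which we multiply by $Y_m$ and integrate over $\neg\mathcal{B}$ to get
\[
\bigl|\mathbb{E}(Y_mY_n\mathbf{1}_{\neg\mathcal{B}}) - \mathbb{E}(Y_n)\mathbb{E}(Y_m\mathbf{1}_{\neg\mathcal{B}})\bigr|
 \le \frac{C}{\lambda}\,\mathbb{E}(Y_m\mathbf{1}_{\neg\mathcal{B}}) \le \frac{C}{\lambda}.
\]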



With (2), (3) we get
\[
|\mathbb{E}(Y_nY_m) - \mathbb{E}(Y_n)\mathbb{E}(Y_m)| \le \frac{C\log^3\lambda}{\lambda}. \tag{5}
\]



This finishes the lemma: define
\[ A_N = \sum_{n=1}^{N} Y_{2^n} \]
and estimate $\mathbb{V}A_N$. We get
\[
\mathbb{V}A_N = \sum_{i=1}^{N} \mathbb{V}Y_{2^i} + 2\sum_{1\le i<j\le N} \operatorname{cov}(Y_{2^i}, Y_{2^j})
 \le \sum_{i=1}^{N} C + 2\sum_{1\le i<j\le N} C\cdot 2^{-|i-j|/3}\,|i-j|^3 \le CN
\]
where the bound for $\mathbb{V}Y_{2^i}$ comes from the exponential decay (1), and the bound for
the covariances is exactly (5); recall that $\lambda$ was defined by
$\lfloor (n/m)^{1/3} \rfloor$. On the other hand, $\operatorname{Med} Y_{2^i} = 1$ so
$\mathbb{E}Y_{2^i} \ge \tfrac12$ and $\mathbb{E}A_N \ge \tfrac12 N$. This gives that
$A_N/\mathbb{E}A_N$ is concentrated. Using Markov's inequality gives



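\[
\mathbb{P}\Bigl(\Bigl|\frac{A_N}{\mathbb{E}A_N} - 1\Bigr| > \epsilon\Bigr)
 \le \mathbb{P}(|A_N - \mathbb{E}A_N| > c\epsilon N) \le \frac{C}{\epsilon^2N}.
\]
This means that these events happen only finitely many times on any reasonable subsequence
(e.g. $N^2$), and due to $\mathbb{E}Y_{2^i} \approx 1$ and the monotonicity of $A_N$ the
convergence may be extended from a subsequence to all $N$. □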

We are almost done, we just need to handle double returns to o, for which we have the 
following simple lemma. 



Lemma 3. Let $X_1$ and $X_2$ be two independent walkers on an infinite graph $G$, and let
$t > 0$. Then
\[
\mathbb{E}(|\{s \le t : X_1(s) = X_2(s) = o\}|)
 \le C\bigl(\mathbb{E}(|\{s \le t : X_1(s) = o\}|)\bigr)^{2/3}.
\]

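Proof. Denote $M = \mathbb{E}(|\{s \le t : X_1(s) = o\}|)$. On any infinite graph,
$\mathbb{P}(X_2(s) = o) \le C/\sqrt{s}$ (see e.g. [2]). Hence we write
\[
\begin{aligned}
\mathbb{E}(|\{M^{2/3} \le s \le t : X_1(s) = X_2(s) = o\}|)
 &= \sum_{s=M^{2/3}}^{t} \mathbb{P}(X_1(s) = o)^2 \\
 &\le \sum_{s=M^{2/3}}^{t} \mathbb{P}(X_1(s) = o)\cdot\frac{C}{\sqrt{s}}
 \le CM^{-1/3}\sum_{s=1}^{t} \mathbb{P}(X_1(s) = o) = CM^{2/3}.
\end{aligned}
\]
Since the number of visits up to time $M^{2/3}$ is definitely bounded by $M^{2/3}$, we are
done. □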
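
As a sanity check of the lemma in one concrete case, the sketch below computes both sides
exactly for two independent walkers on the integer line started at the origin. The choice of
graph and the function name are assumptions made for illustration; it uses the fact that on
the integer line a walker is at the origin at time $2m$ with probability
$\binom{2m}{m}4^{-m}$, so the expected number of double visits is the sum of the squares of
these probabilities.

```python
def lemma3_sides_on_Z(t):
    """Return (expected double visits, expected single visits) up to time t.

    Both walkers are independent SRWs on Z started at 0; times 1..t are
    counted. Odd times contribute nothing because of parity.
    """
    single = 0.0
    double = 0.0
    p = 1.0  # P(one walker at 0 at time 2m), starting from m = 0
    for m in range(1, t // 2 + 1):
        p *= (2 * m - 1) / (2 * m)
        single += p        # adds P(X_1(2m) = 0)
        double += p * p    # adds P(X_1(2m) = X_2(2m) = 0)
    return double, single


if __name__ == "__main__":
    for t in (2, 10, 100, 1000):
        double, single = lemma3_sides_on_Z(t)
        print(t, double, single ** (2 / 3))
```

For the values of $t$ printed here the left-hand side stays below $M^{2/3}$, so on this
particular graph the inequality is far from tight; the lemma is stated for arbitrary infinite
graphs.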

The theorem now follows easily. By lemma 1 we may calculate the median $M_n$ of the $n$-th
return of a single walker to $o$. Defining
\[
Y_n^i = \frac{1}{n}\,|\{\text{visits of walker } i \text{ until } M_n\}|, \qquad Y_n = \sum_i Y_n^i
\]
we can use lemma 2 (the random walk $S$ in lemma 2 is defined by $S_n$ being the time of the
$n$-th visit of the walker to $o$, and then the $M_n$ of lemma 2 are the same as here, and the
$Y_n$ of lemma 2 are the $Y_n^i$ here). We get
\[ \frac{1}{N}\sum_{n=1}^{N} \frac{Y_{2^n}}{\mathbb{E}Y^1_{2^n}} \to k. \]

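We cannot measure $Y_n$ directly, since if two walkers returned to $o$ at the same time, they
contribute 2 to the sum but we cannot see that. Nevertheless, if we define
\[ \tilde{Y}_n = \frac{1}{n}\,|\{1 \le t \le M_n : \exists j,\ X_j(t) = o\}| \]
then $\tilde{Y}_n$ can be measured, and
\[ |Y_n - \tilde{Y}_n| \le \frac{1}{n}\sum_{i<j} |\{1 \le t \le M_n : X_i(t) = X_j(t) = o\}| \]
and the expectation of each term is bounded by lemma 3 by $(\mathbb{E}(nY_n^1))^{2/3}/n$.
Since $\mathbb{E}Y_n^1 \le C$ this gives that
\[ \mathbb{E}|Y_n - \tilde{Y}_n| \le Ck^2n^{-1/3}. \]
We get
\[ \frac{1}{N}\sum_{n=1}^{N} \frac{\tilde{Y}_{2^n}}{\mathbb{E}Y^1_{2^n}} \to k \]
and the theorem is proved. □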
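
The final running average can be sketched as below. The sketch assumes that the medians of the
$2^n$-th return time and the corresponding single-walker expectations have already been
obtained (in the proof they are derived from lemma 1); here they are simply passed in as the
hypothetical arguments `medians` and `single_means`, and the function name is likewise an
assumption.

```python
def running_average_estimate(occupancy, medians, single_means):
    """Average of (measured normalized visit count) / (single-walker mean).

    occupancy[t - 1] is 1 if the origin is occupied at time t.
    medians[n - 1] is the median of the (2**n)-th return time of one walker.
    single_means[n - 1] is the expectation, for one walker, of its number
    of visits up to that median divided by 2**n.
    """
    total = 0.0
    for n, (median, mean) in enumerate(zip(medians, single_means), start=1):
        occupied_times = sum(occupancy[:median])  # times 1..median with a visit
        y_tilde = occupied_times / 2 ** n         # the measurable variable
        total += y_tilde / mean
    return total / len(medians)
```

For example, `running_average_estimate([0, 1, 0, 1, 0, 1, 0, 0], [2, 6], [0.5, 0.75])`
evaluates to `1.0`; with real data the output only approaches the number of walkers as the
number of scales grows.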



Acknowledgements 

Both authors supported by their respective Israel Science Foundation grants. 

References 

[1] Itai Benjamini, Gady Kozma, László Lovász, Dan Romik and Gábor Tardos, Waiting
for a bat to fly by (in polynomial time). Combinatorics, Probability and Computing,
15:5 (2006), 673-683. Available at: arXiv:math/0310435, cambridge.org

[2] Thierry Coulhon, Random walks and geometry on infinite graphs. In: Lecture notes
on analysis on metric spaces, Trento, C.I.R.M., 1999. Luigi Ambrosio, Francesco
Serra Cassano, ed., Scuola Normale Superiore di Pisa, (2000) 5-30. Available at:
coulhon.u-cergy.fr

[3] Ori Gurel-Gurevich and Asaf Nachmias, Non-concentration of return times, Annals of
Probability, to appear. Available at: arXiv:1009.1438



Weizmann Institute 

Rehovot, Israel 

E-MAIL: itai.benjamini@weizmann.ac.il; gady.kozma@weizmann.ac.il



